A review: preprocessing techniques and data augmentation for sentiment analysis
Abstract
In the literature, machine learning-based studies of sentiment analysis usually rely on supervised learning, which requires pre-labeled datasets that are large enough for the domain in question. Such datasets are tedious, expensive, and time-consuming to build, and it remains hard to handle unseen data. This paper takes a semi-supervised approach to Vietnamese sentiment analysis, for which datasets are limited. We summarize preprocessing techniques used to clean and normalize the data, together with negation handling and intensification handling, to improve performance. We also present data augmentation techniques, which generate new data from the original data to enrich the training set without user intervention. In the experiments, we evaluate various aspects and obtain competitive results that may motivate future proposals.
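The negation handling and data augmentation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the cue lists `NEGATIONS` and `INTENSIFIERS`, the `NOT_` prefix convention, and the helper names are all assumptions made for illustration, and the example uses English words rather than Vietnamese. The augmentation shown is plain random swap and random deletion, chosen here only as a simple instance of generating new samples from existing ones.

```python
import random
import re

# Hypothetical cue lists for illustration only; a real system would use
# language-specific lexicons (the paper targets Vietnamese).
NEGATIONS = {"not", "no", "never"}
INTENSIFIERS = {"very", "extremely", "really"}


def tokenize(text):
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"\w+", text.lower())


def mark_negation(tokens):
    """Drop each negation cue and prefix the next non-intensifier token with 'NOT_'."""
    out, negate = [], False
    for tok in tokens:
        if tok in NEGATIONS:
            negate = True
            continue
        if negate and tok not in INTENSIFIERS:
            out.append("NOT_" + tok)
            negate = False
        else:
            out.append(tok)
    return out


def random_swap(tokens, rng):
    """Return a copy of tokens with two randomly chosen positions swapped."""
    out = list(tokens)
    if len(out) < 2:
        return out
    i, j = rng.sample(range(len(out)), 2)
    out[i], out[j] = out[j], out[i]
    return out


def random_delete(tokens, rng, p=0.2):
    """Drop each token with probability p, keeping at least one token."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept or [rng.choice(tokens)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    toks = tokenize("This is NOT very good!")
    print(mark_negation(toks))
    print(random_swap(toks, rng))
    print(random_delete(toks, rng))
```

Marking the negated word as a distinct token lets a bag-of-words classifier treat "good" and "NOT_good" as different features, while the swap and deletion helpers produce label-preserving variants of a sentence without manual annotation.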
Similar resources
A Comparison between Preprocessing Techniques for Sentiment Analysis in Twitter
In recent years, Sentiment Analysis has become one of the most interesting topics in AI research due to its promising commercial benefits. An important step in a Sentiment Analysis system for text mining is the preprocessing phase, but it is often underestimated and not extensively covered in literature. In this work, our aim is to highlight the importance of preprocessing techniques and show h...
Optimising Sentiment Classification using Preprocessing Techniques
Sentiment Classification refers to the computational techniques for classifying whether the sentiments of text are positive or negative. Sentiment Classification being a specialized domain of text mining is expected to benefit after preprocessing. In this paper we propose various models with selective combinations of preprocessing techniques and Sentiment Classifiers, to optimize Sentiment Clas...
Critical Review of Sentiment Analysis Techniques
The innovation of Web 2.0 is increasing day by day and its latest hype is Microblogging. Short messages exchange platforms connect people worldwide in an unprecedented manner by publishing short text updates regarding various topics. People freely express their views and opinions on microblogging sites that give rise to multiple user sentiments. In order to precisely determine the mood and natu...
Evaluating Topic Modeling as Preprocessing for a Sentiment Analysis Task
Classifying the sentiment of documents is a well-studied problem in Natural Language Processing (NLP). The existence of excellent discriminative classifiers like Maxent has pushed the main body of research in the direction of feature engineering. In this paper, I examine an unusual class of features, the document-topic proportions assigned by the Latent Dirichlet Allocation topic model. In part...
Feature Detection Techniques for Preprocessing Proteomic Data
Numerous gel-based and nongel-based technologies are used to detect protein changes potentially associated with disease. The raw data, however, are abundant with technical and structural complexities, making statistical analysis a difficult task. Low-level analysis issues (including normalization, background correction, gel and/or spectral alignment, feature detection, and image registration) a...
Journal
Journal title: Computational Social Networks
Year: 2021
ISSN: 2197-4314
DOI: https://doi.org/10.1186/s40649-020-00080-x